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Abstract 

The extremal index 9, a measure of the degree of local dependence in the extremes 
of a stationary process, plays an important role in extreme value analyses. We estimate 
6 semiparametrically, using the relationship between the distribution of block maxima 
and the marginal distribution of a process to define a semiparametric model. We show 
that these semiparametric estimators are simpler and substantially more efficient than 
their parametric counterparts. We seek to improve efficiency further using maxima 
over sliding blocks. A simulation study shows that the semiparametric estimators are 
competitive with the leading estimators. An application to sea-surge heights combines 
inferences about 9 with a standard extreme value analysis of block maxima to estimate 
marginal quantiles. 

Keywords. Block maxima ; extremal index ; extreme value theory ; sea-surge heights ; 
semiparametric estimation 


1 Introduction 


The modelling of rare events in stationary processes is important in many application areas. 
The extremal behaviour of such a process is governed by its marginal distribution and by 
its extremal dependence structure. IChavez-Demonlin and DavisonI (120121 ) provide a review of 
this area, concen t rating on the latter aspect. For processes satisfying the D{un) condition of 
Leadbetter et all (119831 1. which limits long-range dependence at extreme levels, the extreme 
value index ^ € [0,1] is the pr i mary measure of short-range extremal dependence. 

Following Iheadbetter et all (jl983l) . let Xi,X 2 ,... be a strictly stationary sequence of ran¬ 
dom variables that satishes the D[un) condition and has marginal distribution function F. 
Let Mb = max(Xi,... ,Xb). In the non-degenerate case when 0 > 0, for large b and Ub the 
distribution function Gb of the block maximum Mb is approximately related to F via 


Gbiub) = P{Mb ^ Ub) ^ F'^Cub). 


( 1 ) 


Further, if there exist normalizing constants Cb and db such that F^[cbX F db) —f G(x), as 
b —)■ oo, then G{x) is the distribution function of a Generalized Extreme Value (GEV) 
distribution. The corresponding result for M^ = max(X^,..., X*), where XJ'jX^,... are 
independent variables with distribution function F, gives the limiting distribution function 
H{x) = G{x)^P. Thus, the limiting distributions of Mb and M^ are GEV, with respective 
location, scale and shape parameters say, related by 


ge = y + a {e^-l)/C ae = (T9^. ( 2 ) 

We consider a block size dependent index ( Smithl . 1992 1 6b = — logG(Mb)/log2, where, with¬ 
out loss of generality, F^{ub) = 1/2. Thus, 9b ^ 9 as b ^ oo. We will only make explicit the 
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dependence of 0 on 6 when necessa r y. Ancona-Navarrete and Tawn ( 20001) (threshold dep 


)en- 


dent index 0{u)) and [Robert et all fj2009f) (block size and threshold dependent index 9b{ub)) 


consider similar sub-asymptotic forms. Unfortunately no general theory exists concerning the 
rate of convergence to 9 of these quantities. Some of the bias in estimating 9 is because 9b 
(or 9{u) or 9h(uh )), rather than 9, is estimated. 

As noted by iBeirlant et all (120041) . ignoring 9 leads to (a) underestimation of marginal 


quantiles of F implied by inferences about G from block maxima, and (b) overestimation of 
quantiles of G implied by inferences about F fro m, for example, a threshold-based analysis of 


raw data. IChavez-Demoulin and DavisonI (120121) note that 9 contains information about the 


Recent advances in the estimation of 9 

(Ferro and S( 

^erers. 

2003 


Siiveges. 

20Q7; 

Siiveges and DavisonI. 

2010; 

Lanrini and Tawn 

, 2 

003l 

Robert. 

2013 

) have concentrated on 


thre shold methods , based on exceedences of a thresho ld. The improvement of maxima meth¬ 
ods (iGomesl. Il993l: lAncona-Navarrete and Tawn). 120001) . based on ([2]) and described in Section 
11.11 has received less attention. We propose a new maxima method that is simpler and has 
much greater statistical efficiency than existing maxima methods. 


1.1 Maxima methods 

Parametric maxima methods are based on htting GEV distributions to two sets of maxima of 
b consecutive observations. The hrst sample Mj, i = 1,..., n is block maxima of the original 
series. The second sample M*,i = l,...,n is block maxima of a series obtained by ran¬ 
domizing the index of the original series, to obtain approximately an ind ependen t series with 


the same marginal distribution as the original sequence. Based on (|2]), iGomesI (1l993l) £t a 


GEV(/i6i, distribution to {M} and a GEV(/i, a, ^) di stribution to {M*| and construct the 
estimator 9g = where f = {a — ag)/{'ll —'jle). Ancona-Navarrete and Tawi] ( 2000[l 

combine the two GEV hts into one by maximizing a likelihood (with respect to {fi,a,^,9)), 
assuming that {{M*}, {M}) are independent. We call the resulting estimator 9at- In one 
sense parametric maxima methods are anomalous: other methods of estim ating 9 do so di- 
rectly, without embedding 0 in a larger model with nuisance parameters. iNorthropI (120051) 
proposes a semiparametric (SP) maxima estimator. The relationship G = is used but no 
particular parametric form is assumed for G or H. 

An undesirable feature concerning existing maxima methods is the need to resample the 
original data to produce a sample of block maxima with approximate c.d.f. H. In Section 
[2] we show that this is unnecessary: more efficient estimators of 9 can be constructed by 
comparing G directly to F, without generating pseudo-samples from FI. The theoretical gain 
in efficiency is quantified, albeit in an idealized situation, in Appendix A and in Section 12.11 
general properties of the semiparametric estimators are discussed. In Section 12.21 we show 
that on e of th ese estimators is approximately an extended version of the blocks estimator of 


RobertI (120091) . In Sections [3] and S] we carry out simulation studies and an extreme value 


analysis of type 1 on sea-surge data respectively. The paper is concluded in Section [5] with a 
discussion and technical proofs are reported in the Appendix. Computer code to implement 
this methodology is available at www.homepages.ucl. ac.uk/~ucakpjn/, 


2 Semiparametric maxima estimators of 6 

Let Xi,, Xm be strictly stationary sequence of random variables with marginal distribution 
function F and extremal index 9. Let M{s,t] = ma.Xs<k^tXk,nd = \rn/b\ and Ug = m —6-1-1. 
Consider two sets of block maxima: = {Yd^ i = 1,... ,nd}, where Yd = M{{i — l)b,ib] 

{disjoint blocks) and Y^ = {Yf,i = l,...,^^}, where Yf = M{i — l,i -|- 6 — 1] {sliding 
blocks). We use n as general notation for the size of a sample of block maxima. Consider, 
for some s G {0,..., m — 6}, Y = M{s, s -|- 6], the maximum of any block of 6 consecutive 
As, and let V = —b\ogF{Y). When F is known and ([T]) holds then V has an exponential 
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distribution with mean 1/0. The maximum likelihood estimator (MLE) of 6 based on a 
random sample Vi,..., 14, from this distribution is Op = n/ with variance var(0i7’) = 

ri?6‘^{n — 2)~^{n — 1)“^. 

Typically F is unknown, so we must use empirical analogues of V. We describe these using 
the sliding block maxima E®. Let Vi = —b log F{Yf),i = 1,... ,m — b+l and let Bi be the set 
of the Xs that contribute to block maximum Yf. By construction, the b values in Bi cannot 
exceed block maximum X/. To adjust for this deterministic effect we use only the m — b values 
not in Bi to construct the estimator of F applied to Yf. Let k = minxj,^Bi X^. For y ^ k 
we let 

= V—ITYF ^ 


m 


b+1 


Xk^Bi 


where 1(4) is the indicator function of an event A. F-i{Yf) can be expressed in terms of the 
rank Ri G {1,..., m — 6 + 1} of Yj® within Xi,..., X^. If Ri = Tj then rj — 1 of {X^, k ^ B^} 
are larger than Yf and m — 6 + 1 — rj of {X^,, k ^ Bi} are smaller than Yf. Therefore, (|3]) 
gives F_i{Yf) = (m — 6 + 1 — Ri)/{m — 6 + 1). The case Ri = m — b + 1 occurs only when 
Yf < li^ which is unlikely unless 6 is small. To eii sure positivity of F_i{y) we set F_i{y) to 
l/(m — 6 + n + 1) A y < k f Pabrowska et ah 1989). 

Thus, X® produces a sample Vf = —61ogF_i(X/),i = 1,... ,ns, which are determined by 
the respective ranks Ri,i = 1,... ,ns- The disjoi nt block max i ma X ^ produce the subsample 
vf = .i)b+i4 = 1, • • • ,nd- We expect (as in [Robert et all fl2009l) l that X® contains more 

information about 6 than X'^ and thus produces a more efficient estimator of 9. 

Consider block maxima X = (Xi,...,X„) with order statistics X' and let V = {Vj,i = 
1,..., u}, where Vi = —61ogF_i(Xj). The ranks R = (i?i,..., Rn) of X within Xi,..., X^ con¬ 
vey no information about the distribution of X'. Therefore, using a marginal GEV{fi 0 ,ae,^) 
model for the ordered block maxima X', the joint likelihood based on X = (X, X') factorises 
as 

L(0, /ie, ae, 4 Y) = LniO; V) LcEviye, ae, 4 Y'), (4) 

so that independent inferences can be made about 9 and (/xg, ue, ^). We use as an approxi¬ 
mation to Lji{9; V), the pseudo-likelihood 


Le.p(0;X) = 0"exp 



(5) 


that is, the likelihood that would apply if Xi,..., 14 are sampled randomly from an exponential 
distribution with mean 1/0. The disjoint and sliding blocks estimators for 0 are those that 
maximize the pseudo-likelihood Lea;p(6';X), that is. 


9d = 




( 6 ) 


For convenience we will use 9sp to refer to a general estimator of this type. 

The pseudo-likelihood ([5]) is approximate because ([1]) provides only an approximate re¬ 
lationship between G and F. Serial dependence in the underlying sequence, Xi,... ,Xm is 
expected, resulting in dependence between the values of X from nearby disjoint blocks. Use 
of sliding blocks further complicates matters as successive values of X® are strongly positively 
associated. Fven if ([T]) holds and the underlying sequence is i.i.d., estimation of F from a 
hnite sample introduces a further approximation. Moreover, the double use of the sample to 
estimate F and to provide block maxima, induces depen dence between Vi,... ,Vn that is not 
negligible asymptotically, see for example, [Robert] (120091) . Therefore, the expression given for 
var(0i7’) at the start of Section [2] does not apply in either case. 
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We attempt to make some adiust ment for thes e issues by basing estimates of uncertainty 
on information sandwich estimators f Whitel. 1982 ') of the sampling variances of the estimators 
of 9. Details are given in Appendix B. Some unrealistic simplifying assumptions are used, 
such as observat ions from distinct blocks b eing independent, so we use as an alternative a 
block bootstrap f Politis and Romanol. 1994). In common with other estimators of 9, studying 
the asymptotic properties of 9d and 9s is difficult and we do not attempt such an analysis 
here. 


2.1 Desirable properties of these estimators 

The estimator 9sp is simple and non-iterative. Appendix A shows that, in an idealized 
situation where data can be treated as random samples from the respective models, it is 
more efficient than its parametric counterparts. This hnding is supported by a simulation 
study presented in Section 3. The extremal index measures local dependence in extremes 
and is independent of the marginal distribution of the process. Since 9sp is determined by 
the ordering of the ranks of the raw data it is invariant to marginal transformation, whereas 
the parametric alternatives are not. In common with threshold methods, 9 is estimated 
directly, rather than as part of a larger extreme value model, that is, estimation of extremal 
dependence and marginal behaviour are separated. It may be that the assumption (G = 
underlying 9sp is reasonable for smaller block sizes than the parametric GEV assumptions. 
Section [3] gives examples where the rate of convergence of 0;, to 6^ as 6 —)■ oo is 0(1/6). Thus, 
convergence to G = could be relatively fast even if convergence to the limiting GEV form, 
which depends on the marginal distribution, is slow. The semiparametric framework permits 
the use of a relatively small block size, whereas the parametric alternatives do not. 


2.2 Link to the blocks estimator 


he blocks estimator (sometimes called the logs estimator) 9 b = losG(u)/b\ oe:F(u) 


( Smith and Weissman . 1994h requires the choice of a threshold u and a block size b. Robert 


(120091) extends this idea by using a random threshold, set at a particular sample quantile. We 
show that in large samples 9^ gives approximately the same value as a combination of blocks 
estimators, each based on its own local data-dependent threshold. Suppose that we use a set 
of random thresholds Uj = V, where l/G = 1 , ■ ■ ■ ,n are a sample of block maxima over blocks 
of length b. We may think of this as using the data to dehne a set of local thresholds. Each of 
the ratios logG(yi)/61ogF(y)) is an estimator of 9. We combine these using a ratio estimator 


/6 

-5^1ogG«) -Vi°sG(K.) 

?RB = —-. (7) 

b-Flog F(Y,) 

i=l 2=1 


If disjoint blocks are used to construct {V} then G(y(j)) = i/n and, by Stirling’s formula, as 
n —>■ oo the numerator of ([7]) j, —1. Therefore, for sufficiently large n, 9d ~ 9bb- A potential 
advantage of maxima estimators over blocks estimators is that all block maxima contribute 
information directly to the estimator, regardless of whether or not they exceed some threshold 

u. 


2.3 Theoretical comparison with parametric maxima method 

The calculations in Appendix A show that, in an idealized situation where data can be treated 
as random samples from the respective models, 9d has a smaller asymptotic variance than 
9 at- The asymptotic variance of 9 at depends on the marginal distribution of the raw data. 
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via the shape parameter ^ of the GEV distribution assumed for block maxima, whereas the 
asymptotic variance of 6*^ does not. The efficiency of 9at relative to 9d depends on 9 and, 
to a lesser extent, on ^ (see Figured]) but 9at is at best 50% efficient (when 9 = 1). This 
is expected because the resampling used to produce the parametric estimator introduces an 
extra source of variability. 



e 

Figure 1: Asymptotic relative efficiency of the parametric maximum likelihood estimator of 
9 compared to the semiparametric maximum likelihood estimator for ^ = —0.4 (solid lines), 
.^ = 0 (dashed line) and .^ = 0.4 (dotted line). 


3 Simulation studies 

We present two types of simulation study. The first shows that the conjectured superiority 
of the SP estimators relative to existing maxima methods is realised in practice and exam¬ 
ines how best to estimate the sampling variability of the former. The second compares the 
performance of the SP estimators to the most efficient threshold methods. 


3.1 Maxima estimators 


We compare the SP estimators of 9 to the parametric estimators 9 at and 9g for different 
processes, values of 9, marginal distributions and blocks sizes. The proces s es are those for 
which results are presented in Tables 4-8 of lAncona-Navarrete and TawnI (120001) . As the 
general findings ar e the same in all cases w e present results only for a max-autoregressive 
(maxAR) process (jPavis and Resnicki . 119891) : Xi = max{(l — 9)Xi-i,9Zi}, where {Zi} and 
Xq have independent unit Frechet distributions. For this process 9h = 9 + {1 — 9)/b (see 
Appendix C). 

We simulate 500 sequences of length m = 4, 900 and estimate 9b using the semiparametric 
estimators (disjoint and sliding blocks) and 9at and 9g for b = 20, 70, 245 (n = 245, 70, 20). 
To examine the impact of marginal distribution we apply 9at and 9g after transformation 
to Gumbel, Gaussian and exponential margins. In fact the marginal distribution has a rel¬ 
atively small effect on the overall performance of 9at and 9g so we present results only 
for Gumbel margins. However, marginal distribution can have an impact on individual es¬ 
timates. Figure |2] compares the estimates produced by 9at for a moving maxima process 
{Xi = maXj=o,..., 3 {'ai^j-i-i })5 for aj = 1/4, j = 0,..., 3, with Gaussian and Gumbel margins. 
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For small b = 20 {n = 245) the agreement is good but for b = 245 {n = 20) there is large 
disagreement for some datasets. 


b=20 b=245 



Figure 2: Estimates 6 at based on data simulated from a moving maxima process with a = 
(1/4,1/4,1/4,1/4) and 6 = 1/4: Gaussian margins against Gumbel margins. Left: b = 20. 
Right: b = 245. 

Table d] compares the SP estimators to 9at and Oq- Naive standard errors are estimated 
using n9sp{n — — 1)“^ for the SP estimators and Appendix A for 9at- For the SP 

estimators we also estimate adjusted standard errors based on a sandwich estimator (Ap- 
pend ix B) and bootstrap stand ard errors based on 100 stationary blo ck bootstrap resam¬ 


ples fjPolitis and Romanolll994jl with o ptima l blo ck length chosen using 


implemented using iGantv and Riplevi (120141) and iHavheld and Racind (120081) . When using 


Patton et all (120091) . 


disjoint blocks the semiparametric estimator outperforms the parametric estimators approxi¬ 
mately to the extent suggested by Figure [TJ The use of sliding maxima improves this further. 
Gomparison of the mean standard errors (SE) with standard deviations (SD) shows that 
the (non-bootstrap) standard errors tend to be a little too large, that is, the estimators are 
less variable than expected. In general the bootstrap standard errors are more reliable. As 
expected, for the estimators based on sliding blocks the naive standard errors are far too 
small. 

3.2 SP maxima, blocks, intervals and iC-gaps estimators of 9 

We compare the performa nce of the SP maxima estimators to the blocks estimator of 
Smith and WeissmanI (Il994lj and t o two of the leading threshold-based estimators: the inter¬ 
vals e stimator of iFerro and SegersI (120031) and the iP-gaps estimator of lSiiveges and Davison 
(120101) . The general form of the blocks estimator is Ob = log G{u)/b log F{u), for some thresh¬ 
old u and block size b. We consider two blocks estimators: the disjoint blocks estimator uses 
the empirical distribution function of {R/} to estimate G, whereas the sliding blocks estimator 
uses the empirical distribution function of {R/}. 

The threshold-based estimators are based on the marginal distribution of the time T{un) = 
min{/c ^ 1 : | Xi > between two exceedances of threshold Under mild 

conditions, as n —>■ oo the rescaled iP-gap {1 — F(m„)} max{T(M„) — iP, 0} follows a mixture 
model: with probability 1 — 9 the iP-gap is zero, otherwise it has an exponential distribution 
with mean 1/9. The intervals estimator is a moment estimator that (implicitly) uses iP = 0. 
The iP-gaps estimato r is a maximum likelihood e stimator derived by treating successive K- 
gaps as independent. ISiiveges and DavisonI (120101) note that in practice it is important to use 
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b 

est 

6 , 

mean 

RMSE 

SD 

SE 

adj SE 

boot SE 

eff 

20 

d 

0.52 

0.53 

0.028 

0.028 

0.034 

0.033 

0.029 



s 


0.52 

0.023 

0.023 

0.007 

0.031 

0.024 

1.45 


AT 


0.53 

0.037 

0.037 

0.050 



0.56 


G 


0.52 

0.038 

0.038 




0.54 

70 

d 

0.51 

0.51 

0.050 

0.050 

0.061 

0.060 

0.054 



s 


0.50 

0.043 

0.043 

0.007 

0.052 

0.045 

1.38 


AT 


0.51 

0.071 

0.071 

0.091 



0.50 


G 


0.50 

0.075 

0.075 




0.45 

245 

d 

0.50 

0.51 

0.105 

0.105 

0.114 

0.111 

0.108 



s 


0.50 

0.088 

0.088 

0.007 

0.088 

0.087 

1.42 


AT 


0.54 

0.152 

0.148 

0.179 



0.50 


G 


0.51 

0.159 

0.159 




0.44 


Table 1: MaxAR process with 6=0.5. Estimators - d: 6 ^, s: 6*^; AT: 6 at] G: Oq. Sampling dis¬ 
tribution mean, root mean square error (RMSE) and standard deviation (SD), mean standard 
error (SE), sandwich adjusted standard error (adj SE) and boostrap standard error (boot SE) 
and efficiency (ratio of variances) relative to 6 ^ (eff). 


an appropriate value of K and use a model misspecihcation test to assist this choice, and the 
choice of threshold. Based on a simulation study, they hnd that if K is chosen appropriately, 
then the A"-gaps estimator performs better than its com petitors: t he in tervals estimator and 
the iterative weighted least squares estimator (IWLS) of lSiivegesI (120071). _ 

We repeat the simulation study presented in Figure 2 of Stiveges and Davisonl ( 2010 ). 
We simulate 1,000 sequences of length n = 30, 000 from each of the processes: (a) Cauchy 
AR(1). Xi = -|- Zi with (j) = 0.7 and Zt standard Cauchy: 6 = 0.3; (b) Pareto 

AR(2). Xi = (l)iXi-i-\- (l) 2 Xi- 2 -\- Zi, with 01 = 0.95,02 = 0.89 and Zi Pareto with tail index 2: 
6 = 0.25; (c) A Markov chain with Gumbel margins, a symme tric logistic b ivariate distribution 
for consecutive variables and dependence parameter r = 2 fISmithl . |l992|): 6 ~ 0.33. For the 
intervals and iP-gaps estimator we use thresholds corresponding to the 0.95, 0.96, 0.97, 0.98 
and 0.99 empirical quantiles. For the SP maxima and blocks estimators we use block sizes 40, 
60, 80, 100, 120, 150 and 200. The blocks estimators require a block size and a threshold to be 
set. To facilitate comparsion of the SP maxima and blocks estimators we use common block 
sizes and, for a given block size, we use as the threshold the sample median of the disjoint 
block maxima for the disjoint blocks estimator and the sample median of the sliding block 
maxima for the sliding blocks estimator. 

Comparison of maxima estimators and threshold estimators is complicated by the different 
nature of the tuning parameters involved: block size for maxima estimators and threshold 
for threshol d-base d estim ators. To provide a tentative basis for comparison we appeal to a 
result from Smith f 1987h . who, for distributions in the domain of attraction of the Gumbel 
distribution, compared the mean squared error of prediction of extreme quantiles resulting 
from analyses of block maxima and analyses of threshold exceedances. Smith found that 
the optimal sample size (number of exceedances) in the latter is almost double the optimal 
sample size (number of block maxima) in the former. Therefore, in displaying the results of 
the simulation study we use plotting scales that match a proportion of exceedances p with a 
block size of 2 /p. 

In the top three rows of Figure [3] the estimated median relative bias (MRB), standard 
deviation (SD) and root mean squared error (RMSE) of the sliding blocks versions of the 
SP maxima and blocks estimators are compared with the threshold-based estimators. For 
the iP-gaps estimator we pl ot results for the (process-dependent) optimal K determined by 


Siiveges and Davisonl (120101 ) (1 for the Cauchy AR(1), 6 for the Pareto AR(2) and 5 for the 


Markov chain) and for the two values of K closest to the optimal value. 
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Figure 3: Top three rows: relative bias, standard deviation and root mean squared error 
(RMSE) of the sliding blocks SP (labelled N) and blocks (labelled B) estimators, the i^-gaps 
estimator (labels give the value of K] solid line for the optimal K, dotted lines otherwise), the 
intervals estimator (F). Bottom row: RMSE of the SP maxima and blocks estimators (dotted 
lines for disjoint maxima, solid lines for sliding maxima). Left: Cauchy AR(1); middle: Pareto 
AR(2); right: symmetric logistic Markov chain. The upper axis labels give the non-exceedance 
probability for the threshold-based estimators. The lower axis labels give the block size for 
the SP maxima and blocks estimators. 














Bias can be attributed to two sources: lack of convergence of 9b (or 9{u)) to 9 and bias 
in estimation of 9b (or 9{u)). The former depends on the process and if convergence is 
slow then the bias may be a strong determinant of the performance of estimators of 9 even 
for long sequences of data. This seems to be the case for the Pareto AR(2), where the 
RMSE plot mirrors the MRB plot. For the Cauchy AR(1) (and to a lesser extent for the 
Markov chain) the biases are smaller so that the relative variabilities of estimators have more 
influence on the RMSE. For the Pareto AR(2) the A'-gaps estimator suffers from relatively 
large bias and variability if K is chosen to be slightly too small {K = 5), but for the other 
two processes the performance of the A'-gaps estimator is quite insensitive to small deviations 
from the optimal K. Although choices of block size and threshold make direct comparison 
difficult, the SP maxima estimators are competitive with the threshold-based estimators. 
They have relatively large bias for small block sizes but their low sampling variability results 
in a relatively small RMSE for larger block sizes. The SP maxima estimator has lower SD 
than the blocks estimator, but, particularly for the smaller block sizes, a larger MRB. The 
SP maxima estimator uses the data to set local thresholds, whereas the blocks estimator has 
a constant threshold, which here we have set at the median of the local thresholds. In this 
instance, the net effect is that the SP estimator trades a reduction in SD for an increase in 
MRB. 


In the bottom row of Figure [3] the disjoint blocks and sliding blocks version of the SP 
maxima and blocks estimators are compared. The main advantage of using sliding blocks is 
a reduction in SD. For the Cauchy AR(1) and the Markov chain this effect is apparent in the 
RMSE. However, for the Pareto AR(2), where bias dominates, the improvement in RMSE is 
minimal. 

Figure 0] shows the results of extending the study to three more processes: a Gaussian 
AR(1) process: (d) Xj = aXj_i -|- e,, where {ei} are independent N(0,1 — a^), Xq r'U X(0,1) 


and we assume |a| < 1 for second-order stationarit y. This process exhibit s serial dependence 
but limiting extremal independence because 9 = 1 flLeadbetter et ai Il983l. chanter 4): (e) the 
maxAR process of Section [3Tl 9 = 0.5; (f) a moving maxima process f Dehenv^ . 1983): Xj = 

= 1 . 


,p-i, with 


-0 


maXj=Q^ p{ajZi_^_j}, where ao > 0, > 0 and aj ^ 0, for j = 1, 

9 = m ax,;=n pfad. We consider the case a = (0.3, 0.2, 0.2, 0.3) flAncona-Navarrete and Tawnl . 
2000h so that 9 = 0.3. 


The hndings echo those from Figured The SP estimators are competitive with the thresh¬ 
old estimators and the X-gaps estimator only performs better than the other estimators if 
K is selected appropriately. In the maxAR and moving maxima examples the SP estimator 
fares no worse than the blocks estimator in terms of MRB and better in terms of SD. In the 
Gaussian AR(1) case all estimators underestimate the limiting value 0 = 1 to the extent that 
bias dominates the RMSE. The blocks estimators have less bias than the other estimators 
and therefore have the lowest RMSE of all the estimators. 


4 Example: Newlyn sea-surges 


Figure [5] shows a series of 2894 measurements of sea-surge heights taken just off the coast 
at Newlyn, Gornwall, UK, over the perio d 1971 - 1976. The data are the maxirnum h ourly 
surge heights over periods of 15 hours (see iGolesI fjl99ll) ). iFawcett and Walshawl (120121) used 


several estimators, including the parametric maxima estimator of iGomesI fjl993l) . to estimate 
the extremal index of the underlying process using several estimators. We use 9sp to estimate 
the extremal index of this series, based on series of disjoint and sliding block maxima. We 
also £t a GEV distribution to the block maxima in order to make inferences about extreme 
quantiles of the marginal distribution of sea-surge heights at Newlyn. 
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Gaussian AR(1) maxAR moving maxima 



Figure 4: Top three rows: relative bias, standard deviation and root mean squared error 
(RMSE) of the sliding blocks SP (labelled N) and blocks (labelled B) estimators, the i^-gaps 
estimator (labels give the value of K] solid line for the optimal K, dotted lines otherwise), 
the intervals estimator (F). Bottom row: RMSE of the SP maxima and blocks estimators 
(dotted lines for disjoint maxima, solid lines for sliding maxima). Left: Gaussian AR(1); 
middle: maxAR; right: moving maxima. The upper axis labels give the non-exceedance 
probability for the threshold-based estimators. The lower axis labels give the block size for 
the SP maxima and blocks estimators. 
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Figure 5: Time series plot of 2894 maximum sea-surges measured at Newlyn, Cornwall, UK 
over the period 1971-1976. The observations are the maximum hourly sea-surge heights over 
contiguous 15-hour time periods. 


The top left plot in Figure E] shows 9sp against block size b based disjoint and sliding 
maxima. Also give n are 95% conhdenc e inter vals for 6 , based on (vertically-scaled) adjusted 
log-likelihoods, see IChandler and Batel (120071. page 182). The other plots in Figure E] show 
maximum likelihood estimates for GEV distributions htted to the disjoint and sliding maxima, 
with, for the disjoint maxima only, symmetric 95% confidence intervals. As the location and 
scale of the GEV distribution depend on b we have plotted estimates of the GEV parameters 
implied for a block size of 1, i.e. the marginal distribution of the data. 

For these data 6 = 20 is reasonable. Table [2] shows estimates, standard errors and 95% 
conhdence intervals for 6 using this block size. The bootstrap estimates result from the 
approach detailed in Section [3T] using 10,000 resamples. The acc uracy of bootstrap con h dence 
intervals can depend on the parameter scale chosen. Following iDavison and Hinklevi (119971. 
Section 5.2) we seek a monotone variance-stabilizing transformation h{9), with the property 
that va.T[h{6sp)] is approximately constant with respect to h{6). It is also often the case 
that bootstrap estimates of h{6) are closer to being normally distributed than estimates of 6 . 
From the start of Section [2] we have var(6'ir) oc 6*^, which suggests that we use h{9) = log6'. 
The expression for var(6'i?) is not correct in practice, but it may suggest an ehective variance- 
stabilizating transformation. We construct bootstrap conhdence intervals for log 9 and then 
transform them back to the 0-scale. Basic conhdence intervals are given in Tabled but as the 
bootstrap distributions of log 05 p are indeed very close to being normally distributed these 
intervals are very similar to normal intervals. Bootstrap bias-adjustment results in a slightly 
smaller poin t estimate of 9 . _ 

Following iGomesI (119981) . iFawcett and Walshawl (120121) used the larger block size y/m ~ 54, 
obtaining an estimate of 0.282 with a standard error of 0.206. For this block size the 9sp 
compares favourably, with estimates (and adjusted standard errors) of 0.269 (0.044) using 
disjoint blocks and 0.245 (0.0 40) using sliding bloc k s. Th e bootstrap standard errors are 
0. 047 and 0.039 respective ly. iFawcett and Walshawl (120121) also use the intervals estimator 
of Ferro and SegersI ( 20031) . based on a threshold of 0.3m selected using a mean residual life 
plot, obtaining 0.223 (0.050). The standard errors in Table |2] suggest that, at least for these 
data, 9sp is competitive with the intervals estimator when both approaches are allowed to 
select their tuning parameter (block size for 9sp and threshold for the intervals estimator) 
using the observed data. 
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Figure 6: Block size selection for the Newlyn data. Top left: estimates and 95% confidence 
intervals for 6 based on disjoint maxima (solid lines) and sliding maxima (dashed lines). Other 
plots: estimates of marginal {b = 1) GEV parameters based on disjoint block maxima (solid 
lines) and sliding maxima (dashed lines). Symmetric 95% conhdence intervals are also given 
for the disjoint maxima. 


Figure [7] shows estimates and 95% conhdence intervals of high quantiles of the marginal 
distribution of sea-surge height. The sum of the adjusted log-likelihood for 9 based on sliding 
maxima and the log-likelihood for the GEV parameters based on disjoint maxima is prohled 
with respect to the desired quantile. The estimates (and standard errors) of the GEV pa¬ 
rameters He, 0-0 and ^ are 0.192 (0.012), 0.130 (0.0085) and —0.0546 (0.056) respectively. The 
underestimation that would result from assuming that 6^ = 1 is clear. 


5 Discussion 


The semiparametric maxima estimators proposed in this paper improve substantially the 
existing maxima methods of estimating the extremal index, to the extent that they are com¬ 
petitive with threshold methods. The simulation studies in Section |3] showed that there is 
beneht to using sliding blocks rather than disjoint blocks. Apart from the point estimates 
in Figure |6] we did not u se sliding blocks for the GEV analysis in Section 01 As noted by 
Ferro and Pezznllil (120051) . who employ an approach that is similar to sliding blocks, further 
research is required to determine how best to provide estimates of uncertainty from analyses 
based on sliding blocks. 

If the main extreme value analysis is threshold-based then, once the threshold has been set, 
the intervals estimators and the IWLS estimator do not require another tuning parameter to 
be specihed. In contrast the iF-gaps estimator and the SP maxima estimators do. However, 
Siiveges and DavisonI (120101) shows that there is potential beneht in choosing K empirically, 
jointly with the threshold. Future work could seek to optimize the choice of block size b 
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SE(0) 

95% Cl 


naive 

0.241 

0.020 

(0.204, 0.283) 

disjoint 

adjusted 

0.241 

0.026 

(0.194, 0.295) 


bootstrap 

0.219 

0.027 

(0.179, 0.265) 

sliding 

adjusted 

0.238 

0.028 

(0.188, 0.296) 

bootstrap 

0.213 

0.023 

(0.180, 0.251) 


Table 2: Estimates, standard errors and 95% confidence intervals for the extremal index 6 
of the Newlyn data using (disjoint or sliding) blocks of size of 20. Naive: independence 
log-likelihood; adjusted: adjusted log-likelihood; bootstrap: stationary block bootstrap. 



Figure 7: Estimates and 95% confidence intervals for the 100(1 — p)% marginal quantile Xp 
against 1/p. Solid lines: inferring 6 using 6sp based on sliding maxima. Dashed lines: using 

e = i. 

empirically. 
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Appendix A: asymptotic efficiencies of semiparametric 
and parametric estimators using disjoint blocks 


The semiparametric estimator 6sp is based on a sample Vi,... ,Vn treated as randomly sam¬ 
pled from an exponential distribution with mean 1/9. The log-likelihood for a single obser¬ 
vation V is l{9) = log 6* -|- 9v. Thus, the asymptotic precision of 9sp is -l"{9) = l/9\ 

The parametric estimator 9at of Ancona-Navarrete and Tawn f 2000l) treats as independent 
two random samples, each of size n. Sample 1 is from a GEV(/i, a, ^) distribution and sample 
2 is from a GEV{pg,ae,^) distribution. Here we take n = 1. Let denote the Fisher 
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information matr ix for a sample of size 1 from a GEV(/i, cr, distribution. This matrix can 
be inferred from iPrescott and WaldenI fll980h . who use a shape par ameter k = —The 
calculation of the asymptotic precision of Oat requires that ^ > —1/2 (jSmithl . 1198511 . 

Let 1 1 and I 2 denote the respective Fisher information matrices for the parameter vector 
( 6 *, from samples 1 and 2 . Ji is given by 


'' ( 0 ^ 

where 0 = (0, 0, 0). Let xf) = (/i^, cr^, and r] = {0, and Aij = dxpi/d'qj. I 2 is given by 

I 2 = A^/(/i 0 , (Tg, ^)A. The total information is / = I 1 + I 2 , giving 

^ ( i/e^ w \ 

^ Igev J ’ 

where in is a vector with non-zero entries and Igev is the total Fisher information for (/r, a, ^). 
Block inversion of I gives the asymptotic precision of Oat as 

prec(0AT) = l/^^ - >1/0^ = pTec{0MLE), 

the inequality following because Iqev positive dehnite. 


Appendix B: estimating the sampling variances 

The sandwich estimator of the sampling variance of 0 is II{0)~^V{0)J'{0)~^, where the ob¬ 
served information II{0) = n/0’^ and V{0) is an estimate of the variance of the score function. 
Using the notation dehned in Section [2] the log-likelihood is 

n 

m = Y,{^ogo-ov^. 

i=l 

The score function is 

n n n 

m = («”' -1,) = r' ^ (1 -«{>,). 

2 = 1 2=1 2=1 

The variance V{0) = var{f/( 6 ')} of the score satishes 

0^V(0) = varJ^(^l- 0 u) 

I 2=1 

n n j—1 

= ^var (^ 1 - 0 U) + 2 ^^ COY (l-0Vi,l-0Vj^ . ( 8 ) 

2=1 j=2 2=1 

The hrst term of (jH]) is estimated by ~ ^ ■ To estimate the covariance in the second 

term we ignore the possibility that either {1/ < /j} or {Yj < Ij}, as their contributions to the 
covariance are negligible. Let 
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where 


T. 


kGBjDB^ 


S, = Y liX.iY,), 

k^BiUBj 

T, = E 

fcGBinS| 


In the following we make the simplifying assnmption that {X^, /c G -Bj} -LL {Xk, k G Bj}, for 
i 7 ^ j, that is, data from distinct blocks are independent. Under this assnmption, if disjoint 
blocks are nsed then Si an Sj, Si jj_ Tj and Sj jj. Tj, becanse in each case a block of Xs, and/or 
the maximnm of these Xs, are compared to Xs from two other disjoint blocks. Then 


cov (1 - eVi, 1 - eVj) = rcov(u, Uj), 

O'^b^cov 


^ {Si+T,)-1, , (g,+T,)-l \, 


{m — b + 1)^ 


m —6+1 

cov {Ti,Tj), 


m —6+1 


where we have used log a; Ri a; — 1 for x ~ 1. If {Yi > Yj} then Ti = b and Tj < 6 and if 
{Yi <Yj} then Ti < b and Tj = 6. Thus, (Tj — b){Tj — 6) = 0 and 


cov(Ti, Tj) = cov(Ti - 6, Tj - 6), 

= E[(Tj-6)(T,-6)]-E(Tj-6)E(T,-6), 

= -|E(T)-6f, 

where E(T) = 6P(X ^ Y) = b'^9/{b9 + 1). As n{n — l)/2 pairs of blocks contribute to the 
second term of ([8]) it is estimated by —n{n — l)6*^6‘^/(m — 6 + 1)^(66* + 1)^. 

For sliding blocks we note that the second term of ([8j) contains contributions from pairs 
of blocks that overlap and (n — 6)(n — 6 + l)/2 pairs that do not. For the latter the total 
contribution is estimated by — (n — 6)(n — 6 + 1)6*^6^/(m — 6 + 1)^(66* + 1)^. We estimate the 
total contribution of the former by 2 X]fc=i J2i=i Ui{9)Uij^ki^). Thus, the estimators of V{9) 
using disjoint and sliding blocks are respectively 


and 


W) = 0-^ \ 5^(1 - - 


n[n 


l)Pb^ 


i=l 


(m — 6 + 1)2(60 + 1)2 




6—1 n—k 


1-2 


^(1 - 9 i/j)2 + 2 UmUi+k(S) 


i=l 


k=l i=l 


{n - b){n - b + l)9%^ \ 

(m-6 +1)2(60+ 1)2 J ■ 


In practice the contribution to the score function from the largest block maximum is 
non-random, because V(„) = —6 log[(m — 6)/(m — 6 + 1)]. We adjust for this by removing from 
Vd{9) and Vs(0) contributions from V(„). 


Appendix C: for two processes 


In the following Xj, i = 1, 2,... are independent unit Frechet random variables with P{X ^ x) 
= exp(—1/x), for X > 0. Thus, F^ juh) = 1/2 implies th at Ub = 6/log2. 

The maxAR process. Following f Beirlant et ah 20041 chapter 10) 


G(x) = P(Xi ^ X, ..., Xf, ^ x), 

= P(Xi ^ X, 9 Z 2 ^ X, ..., 9Zb ^ x), 
= exp {—[1 + 0(6 — 1)]/x} . 
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Therefore, 9b = — log G{ub )/ log2 = 6* + (1 — 6*)/6. 

The moving maxima process. Let = max(Q;o, • • •, cti) and a~ = max{ap,... ,ai). Then, 
for b ^ p, 


G{x) = P{Xi ^ X,Xb ^ x), 


= P 


— 7 7 < — 


7 < 7 < ^ 


a, 


a. 


p-i 


012 


012 


tti 


) Zp-i^b ^ 


X 


a' 


p—i 


= exp ^ a+ + (6 - p)a+ + ^ 


a. 


X 


J=0 


i=l 


Therefore, 


' p—i 


Ob = a++-i^af+ -pa+ = 9 + cib, 


.1=0 


i=l 


where 1 — 6^ ^ c ^ p9. The lower bound is achieved if j = argniaXj{Q;j} ^ {0,p} and 
ttj = {l—9)/p for i 7 ^ j, or if {oj} are monotonic in i, and the upper bound when ao = Up = 9. 
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